


CLAIMS 



We claim: 




1. A method comprising: 
receiving an entered string; and 

determining how likely a word was to have been entered as the string based 
on at least one edit operation that converts a first character sequence of arbitrary 
length in the word to a second character sequence of arbitrary length in the string. 



2. A method as recited in claim 1, wherein the first character sequence 
has a first length and the second character sequence has a second length that is 
different than the first length. 

3. A method as recited in claim 1, wherein the first character sequence 
has multiple characters and the second character sequence has multiple characters. 

4. A method as recited in claim 1, wherein the first character sequence 
has a first number of multiple characters and the second character sequence has a 
second number of multiple characters that is different from the first number of 
multiple characters. 

5. A method as recited in claim 1 and further comprising determining 
how likely the word is to have been generated. 

6. A method as recited in claim 1 and further comprising conditioning 
the edit operation on a position that the edit occurs at within the word. 
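
One way to read the position conditioning of claim 6 is as an extra key on the substitution probability. The sketch below assumes three position classes (start, middle, end of word); the claim itself does not name any classes, and the table `POS_SUB_PROB`, its values, and both function names are hypothetical.

```python
# Hypothetical table of P(beta | alpha, position); the values are made up.
POS_SUB_PROB = {
    ("ent", "ant", "end"): 0.05,
    ("ent", "ant", "start"): 0.001,
}


def position_class(start: int, length: int, word_len: int) -> str:
    """Classify where a character sequence sits within the word.

    ASSUMPTION: three coarse classes. A sequence covering the whole
    word is reported as "start" because that test runs first.
    """
    if start == 0:
        return "start"
    if start + length == word_len:
        return "end"
    return "middle"


def positional_prob(word, start, alpha, beta, table, default=0.0):
    """Look up P(beta | alpha, position) for alpha found at word[start:]."""
    if not word.startswith(alpha, start):
        raise ValueError("alpha does not occur at the given offset")
    pos = position_class(start, len(alpha), len(word))
    return table.get((alpha, beta, pos), default)


# "ent" -> "ant" at the end of "dependent" versus the start of "entire".
p_end = positional_prob("dependent", 6, "ent", "ant", POS_SUB_PROB)
p_start = positional_prob("entire", 0, "ent", "ant", POS_SUB_PROB)
```

With such a table the same edit can be scored as more likely at the end of a word than at its start, which is the effect the claim describes.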

7. A method as recited in claim 1 and further comprising identifying the 
string as potentially incorrect. 

8. A method as recited in claim 1 and further comprising correcting the 
string to the word. 

9. A computer readable medium having computer-executable 
instructions that, when executed on a processor, perform the method as recited in 
claim 1. 

10. A method comprising: 
receiving an entered string s; and 

determining a probability P(s|w) expressing how likely a word w was to 
have been incorrectly entered as the string s based on one or more edit operations 
that convert first arbitrary-length character sequences $\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_n$ in the 
word w to corresponding second arbitrary-length character sequences $\beta_1, \beta_2, \beta_3, \ldots, \beta_n$ 
in the string s, wherein: 

$$P(s|w) = P(\beta_1|\alpha_1) * P(\beta_2|\alpha_2) * P(\beta_3|\alpha_3) * \cdots * P(\beta_n|\alpha_n)$$
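
The product in claim 10 can be illustrated with a small sketch. It assumes the word and the string have already been split into corresponding sequences, and it uses a hypothetical lookup table `SUB_PROB` whose probabilities are invented for the example.

```python
from math import prod

# Hypothetical table of P(beta | alpha) for arbitrary-length sequences.
# The numbers are illustrative only, not trained values.
SUB_PROB = {
    ("ph", "f"): 0.1,
    ("y", "i"): 0.2,
    ("s", "s"): 0.9,
    ("i", "i"): 0.9,
    ("c", "k"): 0.1,
    ("al", "le"): 0.05,
}


def channel_prob(alphas, betas, table):
    """Return P(s|w) as the product of P(beta_i | alpha_i).

    alphas: sequences taken from the word w, in order.
    betas:  the corresponding sequences in the entered string s.
    Pairs missing from the table get probability 0 (an assumption).
    """
    if len(alphas) != len(betas):
        raise ValueError("need the same number of sequences on both sides")
    return prod(table.get((a, b), 0.0) for a, b in zip(alphas, betas))


# w = "physical" entered as s = "fisikle"
alphas = ["ph", "y", "s", "i", "c", "al"]
betas = ["f", "i", "s", "i", "k", "le"]
p = channel_prob(alphas, betas, SUB_PROB)  # about 8.1e-05
```

Note that corresponding sequences may differ in length ("ph" to "f"), which is the case claim 11 singles out.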

11. A method as recited in claim 10, wherein lengths of corresponding 
first and second character sequences are different. 

12. A method as recited in claim 10 and further comprising determining 
how likely the word w is to have been generated. 

13. A method as recited in claim 10 and further comprising conditioning 
the edit operations on positions that the edits occur at within the word. 

14. A method as recited in claim 10 and further comprising correcting 
the string s to the word w. 

15. A method as recited in claim 10 and further comprising identifying 
the string s as potentially incorrect. 

16. A computer readable medium having computer-executable 
instructions that, when executed on a processor, perform the method as recited in 
claim 10. 



17. A method comprising: 
receiving an entered string s; and 

determining a probability P(s|w) expressing how likely a word w was to 
have been incorrectly entered as the string s, by partitioning the word w and the 
string s and computing probabilities for various partitionings, as follows: 

$$P(s|w) = \sum_{R \in Part(w)} P(R|w) \sum_{\substack{T \in Part(s) \\ |T| = |R|}} \prod_{i=1}^{|R|} P(T_i|R_i)$$

where Part(w) is a set of possible ways of partitioning the word w, Part(s) is a set 
of possible ways of partitioning the string s, R is a particular partition of the word 
w, and T is a particular partition of the string s. 
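
A brute-force reading of the sum in claim 17 is sketched below for very short strings only, since the number of partitions grows exponentially with length. Two things are assumptions rather than part of the claim: partitions use contiguous, non-empty segments, and P(R|w) is taken as uniform over the partitions of w. The table `TOY_TABLE` is hypothetical.

```python
from itertools import combinations
from math import prod


def partitions(text):
    """Yield every split of text into contiguous, non-empty segments."""
    n = len(text)
    for k in range(n):  # k = number of cut points
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            yield [text[a:b] for a, b in zip(bounds, bounds[1:])]


def sum_over_partitions(word, string, table):
    """Sum P(R|w) * prod_i P(T_i|R_i) over partitions with |T| == |R|."""
    word_parts = list(partitions(word))
    string_parts = list(partitions(string))
    if not word_parts:
        return 0.0
    p_r = 1.0 / len(word_parts)  # ASSUMPTION: uniform P(R|w)
    total = 0.0
    for r in word_parts:
        inner = sum(
            prod(table.get((ri, ti), 0.0) for ri, ti in zip(r, t))
            for t in string_parts
            if len(t) == len(r)
        )
        total += p_r * inner
    return total


# Hypothetical P(T_i | R_i) values for a two-letter example.
TOY_TABLE = {("ab", "ac"): 0.1, ("a", "a"): 0.9, ("b", "c"): 0.2}
p = sum_over_partitions("ab", "ac", TOY_TABLE)  # 0.5*0.1 + 0.5*(0.9*0.2)
```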

18. A method as recited in claim 17 and further comprising selecting the 
partition that returns a highest probability. 

19. A method as recited in claim 17 and further comprising determining 
how likely the word w is to have been generated. 

20. A method as recited in claim 17 and further comprising correcting 
the string s to the word w. 

21. A method as recited in claim 17 and further comprising identifying 
the string s as potentially incorrect. 



22. A computer readable medium having computer-executable 
instructions that, when executed on a processor, perform the method as recited in 
claim 17. 

23. A method comprising: 

receiving an entered string s; and 

determining a probability P(s|w) expressing how likely a word w was to 
have been incorrectly entered as the string s, by partitioning the word w and the 
string s and computing probabilities for various partitionings, as follows: 

$$P(s|w) = \max_{R \in Part(w),\, T \in Part(s)} P(R|w) * \prod_{i=1}^{|R|} P(T_i|R_i)$$

where Part(w) is a set of possible ways of partitioning the word w, Part(s) is a set 
of possible ways of partitioning the string s, R is a particular partition of the word 
w, and T is a particular partition of the string s. 

24. A method as recited in claim 23 and further comprising omitting the 
term P(R|w) from the computation of P(s|w). 



25. A method as recited in claim 23 and further comprising setting 
terms P(Ti|Ri) = 1 whenever Ti = Ri. 
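
Taken together, claims 23 to 25 describe a maximization that can be computed without enumerating partitions. The sketch below uses a dynamic program over prefixes of w and s; it drops P(R|w) as in claim 24 and scores identical segments as 1 as in claim 25. The non-empty segment restriction, the `max_len` cap on segment length, and the table contents are assumptions made for the example.

```python
def best_partition_prob(word, string, table, max_len=3):
    """Max over aligned partitions of prod_i P(T_i|R_i), with P(R|w) omitted.

    best[i][j] holds the best score for turning word[:i] into string[:j].
    ASSUMPTIONS: segments are non-empty on both sides and at most
    max_len characters long; unseen non-identical pairs score 0.
    """
    def seg_prob(r, t):
        if r == t:
            return 1.0  # identical segments cost nothing (claim 25)
        return table.get((r, t), 0.0)

    n, m = len(word), len(string)
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 1.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            for a in range(1, min(max_len, i) + 1):
                for b in range(1, min(max_len, j) + 1):
                    prev = best[i - a][j - b]
                    if prev > 0.0:
                        cand = prev * seg_prob(word[i - a:i], string[j - b:j])
                        if cand > best[i][j]:
                            best[i][j] = cand
    return best[n][m]


# Hypothetical substitution probabilities; only the errors need entries.
TABLE = {("ph", "f"): 0.1, ("y", "i"): 0.2, ("c", "k"): 0.1, ("al", "le"): 0.05}
p = best_partition_prob("physical", "fisikle", TABLE)  # about 1e-04
```

The running time is proportional to len(word) * len(string) * max_len squared, so the cap on segment length is what keeps the search tractable.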

26. A method as recited in claim 23 and further comprising determining 
how likely the word w is to have been generated. 

27. A method as recited in claim 23 and further comprising correcting 
the string s to the word w. 

28. A method as recited in claim 23 and further comprising identifying 
the string s as potentially incorrect. 

29. A computer readable medium having computer-executable 
instructions that, when executed on a processor, perform the method as recited in 
claim 23. 

30. A method comprising: 

receiving an entered string s; and 

determining a probability P(s|w) expressing how likely a word w was to 
have been incorrectly entered as the string s, by partitioning the word w and the 
string s and finding a partition R of the word w and a partition T of the string s 
such that $\prod_{i=1}^{|R|} P(T_i|R_i)$ is maximized. 

31. A method as recited in claim 30 and further comprising determining 
how likely the word w is to have been generated. 

32. A method as recited in claim 30 and further comprising correcting 
the string s to the word w. 

33. A method as recited in claim 30 and further comprising identifying 
the string s as potentially incorrect. 



34. A computer readable medium having computer-executable 
instructions that, when executed on a processor, perform the method as recited in 
claim 30. 



35. A method for training an error model used in a spell checker, 
comprising: 

determining, given a <wrong, right> training pair and multiple single 
character edits that convert characters in one of the right or wrong strings to 
characters in the other of the right or wrong strings at differing costs, an alignment 
of the wrong string and the right string that results in a least cost to convert the 
characters; 

collapsing any contiguous non-match edits into one or more common error 
regions, each error region containing one or more characters that can be converted 
to one or more other characters using a substitution edit; and 

computing a probability for each substitution edit. 

36. A method as recited in claim 35, wherein the assigning comprises 
assessing a cost of 0 to all match edits and a cost of 1 to all non-match edits. 

37. A method as recited in claim 35, wherein the single character edits 
comprise insertion, deletion, and substitution. 

38. A method as recited in claim 35, further comprising collecting 
multiple <wrong, right> training pairs from online resources. 

39. A method as recited in claim 35, further comprising expanding each 
of the error regions to capture at least one character on at least one side of the error 
region. 
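
The training steps of claims 35 to 37 can be sketched as follows: a least-cost alignment with cost 0 for matches and 1 for insertion, deletion, and substitution, then collapsing runs of non-match edits into error regions, then a relative-frequency estimate. The estimate is an assumption: it divides each region count by how often the left-hand sequence occurs in the right-hand training words, which may differ from the estimator the application intends. The context expansion of claim 39 is not covered, and all function names are hypothetical.

```python
from collections import Counter


def align(right, wrong):
    """Least-cost alignment; returns (right_char, wrong_char) edits.

    An empty string on one side marks an insertion or a deletion.
    """
    n, m = len(right), len(wrong)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        cost[i][0] = i
    for j in range(m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (right[i - 1] != wrong[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)
    ops, i, j = [], n, m
    while i > 0 or j > 0:  # trace back one least-cost path
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (right[i - 1] != wrong[j - 1]):
            ops.append((right[i - 1], wrong[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            ops.append((right[i - 1], ""))  # deletion
            i -= 1
        else:
            ops.append(("", wrong[j - 1]))  # insertion
            j -= 1
    ops.reverse()
    return ops


def error_regions(ops):
    """Collapse contiguous non-match edits into (alpha, beta) regions."""
    regions, alpha, beta = [], "", ""
    for r, w in ops:
        if r == w:  # a match closes any open region
            if alpha or beta:
                regions.append((alpha, beta))
                alpha = beta = ""
        else:
            alpha, beta = alpha + r, beta + w
    if alpha or beta:
        regions.append((alpha, beta))
    return regions


def occurrences(alpha, words):
    """Count places where alpha occurs; empty alpha counts gap positions."""
    if not alpha:
        return sum(len(w) + 1 for w in words)
    return sum(w.startswith(alpha, i) for w in words for i in range(len(w)))


def estimate(pairs):
    """ASSUMED estimator: count(alpha -> beta) / occurrences of alpha."""
    counts, rights = Counter(), []
    for wrong, right in pairs:
        rights.append(right)
        counts.update(error_regions(align(right, wrong)))
    return {ab: c / occurrences(ab[0], rights) for ab, c in counts.items()}


pairs = [("akgsual", "actual"), ("fisical", "physical"), ("fone", "phone")]
probs = estimate(pairs)
```

For the pair <akgsual, actual> the three single character edits collapse into one region that converts "ct" to "kgs".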

40. A program embodied on a computer readable medium, which when 
executed, directs a computer to perform the following: 
receive an entered string; and 

determine how likely an expected string was to have been entered as the 
entered string based on at least one edit operation that converts a first character 
sequence of arbitrary length in the expected string to a second character sequence 
of arbitrary length in the entered string. 

41. A program as recited in claim 40, wherein the first character 
sequence has a first length and the second character sequence has a second length 
that is different than the first length. 

42. A program as recited in claim 40, wherein the first character 
sequence has multiple characters and the second character sequence has multiple 
characters. 

43. A program as recited in claim 40, wherein the first character 
sequence has a first number of multiple characters and the second character 
sequence has a second number of multiple characters that is different from the first 
number of multiple characters. 



44. A program as recited in claim 40, further comprising computer- 
executable instructions that direct a computer to determine how likely the 
expected string is to have been generated. 



45. A program as recited in claim 40, further comprising computer- 
executable instructions that direct a computer to perform, depending upon how 
likely an expected string was to be incorrectly entered as the entered string, one of 
the following: (1) leave the entered string unchanged, (2) autocorrect the entered 
string into the expected string, or (3) offer a list of possible corrections. 
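
The three outcomes in claim 45 could be selected by thresholding a score for each expected string. The sketch below is only one possible policy: the thresholds, the assumption that scores are normalized to the range 0 to 1, and the names `Action` and `choose_action` are all hypothetical.

```python
from enum import Enum


class Action(Enum):
    LEAVE_UNCHANGED = 1
    AUTOCORRECT = 2
    OFFER_LIST = 3


def choose_action(entered, scored, auto_threshold=0.9, suggest_threshold=0.1):
    """Pick one of the three outcomes for an entered string.

    scored: list of (expected_string, score) pairs, with scores assumed
    to be normalized so that they are comparable to the thresholds.
    Returns (action, list_of_corrections).
    """
    if not scored:
        return Action.LEAVE_UNCHANGED, []
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    best, best_score = ranked[0]
    if best == entered or best_score < suggest_threshold:
        return Action.LEAVE_UNCHANGED, []
    if best_score >= auto_threshold:
        return Action.AUTOCORRECT, [best]
    suggestions = [w for w, sc in ranked if sc >= suggest_threshold and w != entered]
    return Action.OFFER_LIST, suggestions


result = choose_action("thew", [("the", 0.5), ("thaw", 0.3), ("threw", 0.05)])
```

A confident single candidate is applied automatically, several plausible candidates are offered as a list, and anything else leaves the entered string alone.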

46. A spell checker program, embodied on a computer-readable 
medium, comprising the program of claim 40. 

47. A language conversion program, embodied on a computer-readable 
medium, comprising the program of claim 40. 

48. A word processing program, embodied on a computer-readable 
medium, comprising the program of claim 40. 



49. A program embodied on a computer readable medium, which when 
executed, directs a computer to perform the following: 

(1) receive an entered string s; 

(2) for multiple words w in a dictionary, determine: 

(a) how likely a word w in a dictionary is to have been generated, 
P(w|context); and 

(b) how likely the word w was to have been entered as the string 
s, P(s|w), based on at least one edit operation that converts a first 
character sequence of arbitrary length in the word to a second 
character sequence of arbitrary length in the string; and 
(3) maximize P(s|w)*P(w|context) to identify which of the words is most 
likely the word intended when the string s was entered. 
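
Step (3) of claim 49 amounts to an argmax over the dictionary. The sketch below takes the error model and the source model as plain callables; the toy tables and their probabilities are invented, and the toy source model ignores its context argument.

```python
def most_likely_word(s, dictionary, error_model, source_model, context=()):
    """Return the word w maximizing P(s|w) * P(w|context)."""
    return max(dictionary, key=lambda w: error_model(s, w) * source_model(w, context))


# Hypothetical model values, for illustration only.
TOY_CHANNEL = {("thew", "the"): 0.005, ("thew", "thaw"): 0.01, ("thew", "threw"): 0.008}
TOY_SOURCE = {"the": 0.05, "thaw": 0.0001, "threw": 0.0005}


def toy_error_model(s, w):
    """P(s|w) from the toy table; unseen pairs get 0."""
    return TOY_CHANNEL.get((s, w), 0.0)


def toy_source_model(w, context):
    """P(w|context); this toy version is a unigram lookup."""
    return TOY_SOURCE.get(w, 0.0)


def uniform_source_model(w, context):
    """Treats every dictionary word as equally likely."""
    return 1.0


dictionary = list(TOY_SOURCE)
best = most_likely_word("thew", dictionary, toy_error_model, toy_source_model)
channel_only = most_likely_word("thew", dictionary, toy_error_model, uniform_source_model)
```

In this toy setup the error model alone prefers "thaw", while the combined score prefers "the" because the source model rates it far more probable.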

50. A program as recited in claim 49, wherein the determination (2) is 
performed for all words in the dictionary. 

51. A program as recited in claim 49, further comprising computer- 
executable instructions that direct a computer to either (1) leave the string 
unchanged, (2) autocorrect the string into the word, or (3) offer a list of possible 
corrections. 

52. A spell checker program, embodied on a computer-readable 
medium, comprising the program of claim 49. 

53. A language conversion program, embodied on a computer-readable 
medium, comprising the program of claim 49. 



54. A spell checker comprising: 
a source model component to determine how likely a word w in a 
dictionary is to have been generated; and 



Lee & Hayes, PLLC 






03310OUO6 MS1-471US.PAT.APP.DOC 






an error model component to determine how likely the word w was to have 
been incorrectly entered as the string s based on arbitrary length string-to-string 
transformations. 

55. A spell checker as recited in claim 54, wherein the string-to-string 
transformations involve conversion of a first character sequence of a first length 
into a second character sequence of a second length that is different than the first 
length. 

56. A spell checker as recited in claim 54, wherein the string-to-string 
transformations involve conversion of a first character sequence with multiple 
characters into a second character sequence with multiple characters. 

57. A spell checker as recited in claim 54, wherein the string-to-string 
transformations involve conversion of a first character sequence having a first 
number of multiple characters into a second character sequence having a second 
number of multiple characters that is different from the first number of multiple 
characters. 